AITopics | systematic study

XTC: Extreme Compression for Pre-trained Transformers Made Simple and Efficient

Neural Information Processing Systems | Dec-23-2025, 19:43:51 GMT

Extreme compression, particularly ultra-low bit precision (binary/ternary) quantization, has been proposed to fit large NLP models on resource-constrained devices. However, to preserve accuracy under such aggressive compression schemes, cutting-edge methods usually introduce complicated compression pipelines, e.g., multi-stage expensive knowledge distillation with extensive hyperparameter tuning. Also, they often focus less on smaller transformer models that have already been heavily compressed via knowledge distillation, and they lack a systematic study to show the effectiveness of their methods. In this paper, we perform a very comprehensive systematic study to measure the impact of many key hyperparameters and training strategies from previous works. As a result, we find that previous baselines for ultra-low bit precision quantization are significantly under-trained. Based on our study, we propose a simple yet effective compression pipeline for extreme compression. Our simplified pipeline demonstrates that (1) we can skip the pre-training knowledge distillation to obtain a 5-layer BERT while achieving better performance than previous state-of-the-art methods, like TinyBERT; (2) extreme quantization plus layer reduction is able to reduce the model size by 50x, resulting in new state-of-the-art results on GLUE tasks.
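To make the term "ternary quantization" in the abstract above concrete, here is a minimal sketch. It is not the XTC pipeline: the abstract does not describe the quantizer, so this uses a common threshold heuristic (a fraction of the mean absolute weight) as a stated assumption. The names `ternarize`, `dequantize`, and `threshold_ratio` are hypothetical.

```python
from typing import List, Tuple


def ternarize(weights: List[float], threshold_ratio: float = 0.7) -> Tuple[List[int], float]:
    """Map each weight to a code in {-1, 0, +1} plus one shared scale.

    ASSUMPTION: threshold_ratio = 0.7 is a commonly used heuristic,
    not a value stated in the abstract.
    """
    if not weights:
        return [], 0.0
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs
    # Weights with small magnitude become 0; the rest keep only their sign.
    codes = [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]
    # The shared scale is the mean magnitude of the weights that were kept.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Reconstruct approximate weights from ternary codes and the scale."""
    return [c * scale for c in codes]


codes, scale = ternarize([0.9, -0.05, 0.4, -1.1, 0.02])
# codes == [1, 0, 1, -1, 0]; scale is approximately 0.8
approx = dequantize(codes, scale)
```

Each code needs at most two bits instead of a 32-bit float, which is where the storage saving of this kind of scheme comes from; how the paper reaches its reported 50x figure is not detailed in the abstract.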

extreme compression, name change, pre-trained transformer, (8 more...)

Neural Information Processing Systems

Genre: Research Report > Promising Solution (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)


Identifying Legal Holdings with LLMs: A Systematic Study of Performance, Scale, and Memorization

Arvin, Chuck

arXiv.org Artificial Intelligence | May-27-2025

As large language models (LLMs) continue to advance in capabilities, it is essential to assess how they perform on established benchmarks. In this study, we present a suite of experiments to assess the performance of modern LLMs (ranging from 3B to 90B+ parameters) on CaseHOLD, a legal benchmark dataset for identifying case holdings. Our experiments demonstrate scaling effects - performance on this task improves with model size, with more capable models like GPT4o and AmazonNovaPro achieving macro F1 scores of 0.744 and 0.720 respectively. These scores are competitive with the best published results on this dataset, and do not require any technically sophisticated model training, fine-tuning or few-shot prompting. To ensure that these strong results are not due to memorization of judicial opinions contained in the training data, we develop and utilize a novel citation anonymization test that preserves semantic meaning while ensuring case names and citations are fictitious. Models maintain strong performance under these conditions (macro F1 of 0.728), suggesting the performance is not due to rote memorization. These findings demonstrate both the promise and current limitations of LLMs for legal tasks with important implications for the development and measurement of automated legal analytics and legal benchmarks.
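The scores quoted in the abstract above are macro F1 values. As a minimal sketch of what that metric computes (the unweighted mean of per-class F1), the following uses toy labels that are purely illustrative and are not CaseHOLD data; the function name `macro_f1` is an assumption.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every class seen in either sequence."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = set(y_true) | set(y_pred)
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0


# Toy multiple-choice answers (illustrative only).
gold = [0, 1, 2, 2, 1]
predicted = [0, 2, 2, 2, 1]
score = macro_f1(gold, predicted)  # per-class F1: 1.0, 2/3, 0.8
```

Because every class contributes equally, macro F1 penalizes a model that does well only on frequent answer positions.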

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2505.02172

Country: North America > United States > California (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Civil Rights & Constitutional Law (0.47)
Government > Regional Government > North America Government > United States Government (0.47)
Law > Litigation (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.47)


Unveiling Redundancy in Diffusion Transformers (DiTs): A Systematic Study

Sun, Xibo, Fang, Jiarui, Li, Aoyu, Pan, Jinzhe

arXiv.org Artificial Intelligence | Nov-17-2024

The increased model capacity of Diffusion Transformers (DiTs) and the demand for generating higher resolutions of images and videos have led to a significant rise in inference latency, impacting real-time performance adversely. While prior research has highlighted the presence of high similarity in activation values between adjacent diffusion steps (referred to as redundancy) and proposed various caching mechanisms to mitigate computational overhead, the exploration of redundancy in existing literature remains limited, with findings often not generalizable across different DiT models. This study aims to address this gap by conducting a comprehensive investigation into redundancy across a broad spectrum of mainstream DiT models. Our experimental analysis reveals substantial variations in the distribution of redundancy across diffusion steps among different DiT models. Interestingly, within a single model, the redundancy distribution remains stable regardless of variations in input prompts, step counts, or scheduling strategies. Given the lack of a consistent pattern across diverse models, caching strategies designed for a specific group of models may not easily transfer to others. To overcome this challenge, we introduce a tool for analyzing the redundancy of individual models, enabling subsequent research to develop tailored caching strategies for specific model architectures. The project is publicly available at https://github.com/xdit-project/DiTCacheAnalysis.
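The abstract above defines redundancy as high similarity in activation values between adjacent diffusion steps. A minimal sketch of measuring that is shown here, assuming cosine similarity as the metric and flattened activation vectors as input; neither choice is stated in the abstract, and this is not the DiTCacheAnalysis tool's API. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjacent_step_similarity(activations: List[Sequence[float]]) -> List[float]:
    """Similarity between step t and step t+1 for every adjacent pair.

    ASSUMPTION: activations[t] is one flattened activation vector
    recorded at diffusion step t for a fixed layer.
    """
    return [
        cosine_similarity(activations[t], activations[t + 1])
        for t in range(len(activations) - 1)
    ]


# Toy trace: steps 0 and 1 are identical, step 2 is orthogonal to step 1.
trace = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
profile = adjacent_step_similarity(trace)  # [1.0, 0.0]
```

A per-step profile like this is the kind of distribution a caching strategy could consult: steps with similarity near 1 are candidates for reusing cached activations.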

artificial intelligence, machine learning, redundancy, (13 more...)

arXiv.org Artificial Intelligence

2411.13588

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.66)


XTC: Extreme Compression for Pre-trained Transformers Made Simple and Efficient

Neural Information Processing Systems | Oct-9-2024, 21:53:19 GMT

extreme compression, knowledge distillation, pre-trained transformer, (4 more...)

Neural Information Processing Systems

Genre: Research Report > Promising Solution (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)


Reviews: Measures of distortion for machine learning

Neural Information Processing Systems | Oct-7-2024, 09:55:19 GMT

SUMMARY
When points in one metric space are embedded into another (e.g. ...). This paper is a systematic study of distortion measures. It formally defines several desired properties of a distortion measure, and compares existing distortion measures through the lens of those formal definitions and by simulations on synthetic data. Based on these, the pros and cons of each measure are discussed, and a new notion of distortion is suggested.

COMMENTS
The systematic study of desirable properties of distortion is solid and in my view constitutes the main strength of this submission.
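As background for the review above, here is a minimal sketch of one classical distortion measure, worst-case distortion: the largest expansion of any pairwise distance multiplied by the largest contraction. The review does not say which measures the paper compares or proposes, so this is an illustrative textbook definition only; Euclidean distance in both spaces and the function names are assumptions.

```python
import math
from itertools import combinations
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def worst_case_distortion(
    original: Sequence[Sequence[float]], embedded: Sequence[Sequence[float]]
) -> float:
    """Max expansion times max contraction over all point pairs.

    ASSUMPTION: both spaces use Euclidean distance and no two points coincide.
    A value of 1.0 means all distances are preserved up to a uniform scaling.
    """
    if len(original) != len(embedded) or len(original) < 2:
        raise ValueError("need two point sets of equal size with at least 2 points")
    expansion = 0.0
    contraction = 0.0
    for i, j in combinations(range(len(original)), 2):
        d_orig = euclidean(original[i], original[j])
        d_emb = euclidean(embedded[i], embedded[j])
        if d_orig == 0.0 or d_emb == 0.0:
            raise ValueError("coincident points make the ratio undefined")
        expansion = max(expansion, d_emb / d_orig)
        contraction = max(contraction, d_orig / d_emb)
    return expansion * contraction


points = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
line = [(0.0,), (2.0,), (3.0,)]  # a toy 1-D embedding of the same three points
value = worst_case_distortion(points, line)  # 3 * sqrt(5), roughly 6.71
```

Because this measure is driven by the single worst pair, one badly embedded pair dominates the result, which is one reason alternative measures are studied.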

dimension, distortion measure, literature, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


A systematic study comparing hyperparameter optimization engines on tabular data

Kegl, Balazs

arXiv.org Artificial Intelligence | Nov-27-2023

We run an independent comparison of all hyperparameter optimization (hyperopt) engines available in the Ray Tune library. We introduce two ways to normalize and aggregate statistics across data sets and models, one rank-based, and another one sandwiching the score between the random search score and the full grid search score. This allows us i) to rank the hyperopt engines, ii) to make generalized and statistically significant statements on how much they improve over random search, and iii) to make recommendations on which engine should be used to hyperopt a given learning algorithm. We find that most engines beat random search, but that only three of them (HEBO, AX, and BlendSearch) clearly stand out. We also find that some engines seem to specialize in hyperopting certain learning algorithms, which makes it tricky to use hyperopt in comparison studies, since the choice of the hyperopt technique may favor some of the models in the comparison.
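The two normalizations mentioned in the abstract above can be sketched as follows. The exact formulas are not given in the abstract, so the linear rescaling between the random search score and the full grid search score, the tie-breaking rule, and all names (`sandwich_score`, `rank_engines`, `engine_a`, and so on) are assumptions made for illustration.

```python
from typing import Dict


def sandwich_score(score: float, random_score: float, grid_score: float) -> float:
    """Rescale a score so that random search maps to 0 and full grid search to 1.

    ASSUMPTION: a plain linear rescaling; the paper may define it differently.
    """
    span = grid_score - random_score
    if span == 0:
        raise ValueError("grid and random search scores must differ")
    return (score - random_score) / span


def rank_engines(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank engines on one (data set, model) pair; rank 1 is the highest score.

    Ties are broken alphabetically by engine name, which is an arbitrary choice.
    """
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical validation scores for three placeholder engines.
raw = {"engine_a": 0.62, "engine_b": 0.71, "engine_c": 0.55}
ranks = rank_engines(raw)  # {"engine_b": 1, "engine_a": 2, "engine_c": 3}
halfway = sandwich_score(0.85, random_score=0.80, grid_score=0.90)  # about 0.5
```

Both quantities are unit-free, so they can be averaged across data sets and models whose raw scores live on different scales.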

engine, hyperparameter optimization engine, random search, (12 more...)

arXiv.org Artificial Intelligence

2311.15854

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems

Hu, Songbo, Zhou, Han, Yuan, Moy, Gritta, Milan, Zhang, Guchun, Iacobacci, Ignacio, Korhonen, Anna, Vulić, Ivan

arXiv.org Artificial Intelligence | Oct-19-2023

Achieving robust language technologies that can perform well across the world's many languages is a central goal of multilingual NLP. In this work, we take stock of and empirically analyse task performance disparities that exist between multilingual task-oriented dialogue (ToD) systems. We first define new quantitative measures of absolute and relative equivalence in system performance, capturing disparities across languages and within individual languages. Through a series of controlled experiments, we demonstrate that performance disparities depend on a number of factors: the nature of the ToD task at hand, the underlying pretrained language model, the target language, and the amount of ToD annotated data. We empirically prove the existence of the adaptation and intrinsic biases in current ToD systems: e.g., ToD systems trained for Arabic or Turkish using annotated ToD data fully parallel to English ToD data still exhibit diminished ToD task performance. Beyond providing a series of insights into the performance disparities of ToD systems in different languages, our analyses offer practical tips on how to approach ToD data collection and system development for new languages.

multilingual task-oriented dialogue system, performance disparity, systematic study

arXiv.org Artificial Intelligence

2310.12892

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.53)


A Systematic Study of Joint Representation Learning on Protein Sequences and Structures

Zhang, Zuobai, Wang, Chuanrui, Xu, Minghao, Chenthamarakshan, Vijil, Lozano, Aurélie, Das, Payel, Tang, Jian

arXiv.org Artificial Intelligence | Oct-18-2023

Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein functions. Recent sequence representation learning methods based on Protein Language Models (PLMs) excel in sequence-based tasks, but their direct adaptation to tasks involving protein structures remains a challenge. In contrast, structure-based methods, which leverage 3D structural information with graph neural networks and geometric pre-training, show potential in function prediction tasks but still suffer from the limited number of available structures. To bridge this gap, our study undertakes a comprehensive exploration of joint protein representation learning by integrating a state-of-the-art PLM (ESM-2) with distinct structure encoders (GVP, GearNet, CDConv). We introduce three representation fusion strategies and explore different pre-training techniques. Our method achieves significant improvements over existing sequence- and structure-based methods, setting a new state of the art for function annotation. This study underscores several important design choices for fusing protein sequence and structure information. Our implementation is available at https://github.com/DeepGraphLearning/ESM-GearNet.

joint representation learning, protein sequence and structure, systematic study

arXiv.org Artificial Intelligence

2303.06275

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


A Systematic Study on Quantifying Bias in GAN-Augmented Data

Liu, Denis

arXiv.org Artificial Intelligence | Aug-23-2023

Generative adversarial networks (GANs) have recently become a popular data augmentation technique used by machine learning practitioners. However, they have been shown to suffer from the so-called mode collapse failure mode, which makes them vulnerable to exacerbating biases on already skewed datasets, resulting in the generated data distribution being less diverse than the training distribution. To this end, we address the problem of quantifying the extent to which mode collapse occurs. This study is a systematic effort focused on the evaluation of state-of-the-art metrics that can potentially quantify biases in GAN-augmented data. We show that, while several such methods are available, there is no single metric that quantifies bias exacerbation reliably over the span of different image domains.

artificial intelligence, dataset, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2308.13554

Country: North America > United States > Arizona (0.05)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.36)


Towards Intercultural Affect Recognition: Audio-Visual Affect Recognition in the Wild Across Six Cultures

Mathur, Leena, Adolphs, Ralph, Matarić, Maja J

arXiv.org Artificial Intelligence | Oct-31-2022

In our multicultural world, affect-aware AI systems that support humans need the ability to perceive affect across variations in emotion expression patterns across cultures. These systems must perform well in cultural contexts without annotated affect datasets available for training models. A standard assumption in affective computing is that affect recognition models trained and used within the same culture (intracultural) will perform better than models trained on one culture and used on different cultures (intercultural). We test this assumption and present the first systematic study of intercultural affect recognition models using videos of real-world dyadic interactions from six cultures. We develop an attention-based feature selection approach under temporal causal discovery to identify behavioral cues that can be leveraged in intercultural affect recognition models. Across all six cultures, our findings demonstrate that intercultural affect recognition models were as effective or more effective than intracultural models. We identify and contribute useful behavioral features for intercultural affect recognition; facial features from the visual modality were more useful than the audio modality in this study's context. Our paper presents a proof-of-concept and motivation for the future development of intercultural affect recognition systems, especially those deployed in low-resource situations without annotated data.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2208.00344

Country:

North America > United States > California (0.14)
North America > United States > Nebraska (0.04)
South America > Chile (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.68)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.68)
